fix(svs): allow all-clear SVS_FLAT bitsets and register raw VAMANA#1603
Conversation
|
@jamesgao-jpg 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:
For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”. Thanks for your efforts and contribution to the community!. |
Before: any non-empty BitsetView passed to SvsFlatIndexNode::Search()
caused an immediate `not_implemented` return, even when every bit was
zero (i.e. no vectors filtered out). Callers that pre-allocate a bitset
for shape consistency -- e.g. the vecTool nightly harness always passes
a bitset whose size matches the base set but leaves it all-clear for
plain kNN -- would see every SVS_FLAT search fail at the first query.
After: reject only when the bitset is both non-empty AND has at least
one bit set. All-clear bitsets fall through to the non-bitset search
path, which is the correct semantic for "no filtering requested".
`BitsetView::count()` (include/knowhere/bitsetview.h:50) returns the
number of filtered-out bits / ids, so the added `&& bitset.count() > 0`
is cheap and doesn't scan the bitmap.
Also register raw `SVS_VAMANA` in the SVS registration block so the
factory can construct it, matching the existing `Type()` implementation
and the LVQ / LeanVec registrations in the same file.
No regression on the actual-filter case: when the caller does have bits
set, we still surface the same not_implemented error with the same
message string, matching the current behavior.
Fixes one of the class of failures seen when running the vecTool
newNightly L2 SVS smoke benchmark (all 6 SVS_FLAT groups across
cohere_1m + qwen3vl_200k x {fp32, fp16, bf16} previously failed
identically at search stage; HNSW siblings on the same group IDs
passed). See zilliztech/vecTool#21 for the upstream discussion.
Signed-off-by: jamesgao-jpg <james.gao@zilliz.com>
d6535cb to
b0d5929
Compare
|
@jamesgao-jpg I'm not sure whether e2e-sse keeps on running... |
|
@jamesgao-jpg e2e jenkins job failed, comment |
1 similar comment
|
@jamesgao-jpg e2e jenkins job failed, comment |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexanderguzhva, jamesgao-jpg The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
issue: #1585 |
What
This PR makes two targeted SVS fixes:
SvsFlatIndexNode::Search's guard fromto
so an allocated-but-all-clear bitset is treated as an unfiltered search.
SVS_VAMANAregistration insrc/index/svs/svs_vamana.cc:KNOWHERE_MOCK_REGISTER_DENSE_FLOAT_ALL_GLOBAL(SVS_VAMANA, SvsVamanaIndexNode, knowhere::feature::NONE)This matches the existing
Type()implementation and the existingSVS_VAMANA_LVQ/SVS_VAMANA_LEANVECregistrations.BitsetView::count()(include/knowhere/bitsetview.h:50) returns the number of filtered-out bits / ids — 0 for an all-clear bitset even whenempty()is false — so the newSVS_FLATcheck is cheap and does not walk the bitmap.Why
These are two separate SVS correctness issues surfaced by vecTool nightly coverage:
BitsetViewmatching the base-set size and leave all bits clear for plain kNN. Before this PR,SVS_FLATrejected that semantic no-op bitset immediately instead of searching.INDEX_SVS_VAMANAandSvsVamanaIndexNode, but only registeredSVS_VAMANA_LVQandSVS_VAMANA_LEANVECin the SVS registration block. That leaves rawSVS_VAMANAconstructible in code but missing from the global registration path.Evidence from vecTool nightly
Running
vecTool's L2 SVS smoke onzilliztech/vecTool@newNightlywithWITH_SVS=ON:SVS_FLATguard fix: all 6SVS_FLATgroups (cohere_1m× fp32/fp16/bf16 andqwen3vl_200k× fp32/fp16/bf16) fail identically at search with:SVS_FLATguard fix in a local build: those 6 groups return search results normally.For raw
SVS_VAMANA, the remote GitHub source showed the registration gap directly:src/index/svs/svs_vamana.ccreturnedINDEX_SVS_VAMANAfromType()but only registeredSVS_VAMANA_LVQandSVS_VAMANA_LEANVECin the registration block.Context: zilliztech/vecTool#21.
Test plan
svs_flat.ccandsvs_vamana.ccSVS_FLATaccepts all-clear bitsets and still rejects bitsets with filtered entriesSVS_VAMANAregistration alongside LVQ and LeanVecSemantics check
bitset.empty() == true→ short-circuit evaluation, falls through (unchanged)bitset.empty() == false && bitset.count() == 0→ falls through (new fast-path, matches other index families)bitset.empty() == false && bitset.count() > 0→ samenot_implementedreturn as today (unchanged)Scope
This PR does not address vecTool nightly policy choices such as which
svs_search_window_sizevalues the downstream matrix runs. It only fixes the Knowhere-sideSVS_FLATbitset semantics and the missing rawSVS_VAMANAregistration.Signed-off-by: jamesgao-jpg james.gao@zilliz.com